Model Overview
The demo.cdc.datenfab.com model service supports the following models; some require an access request before they can be used. A minimal invocation sketch follows the table.
Vendor | Model Name | Description |
---|---|---|
Meta | Meta-Llama-3-70B | Llama3 is Meta's latest large language model, excelling in nuanced language processing, context understanding, code generation, translation, and more. The Meta-Llama-3-70B version has 70 billion parameters, supports 8k context, and is suitable for Q&A and various natural language generation tasks. |
Meta | Meta-Llama-3-8B | Llama3 is Meta's latest large language model, excelling in nuanced language processing, context understanding, code generation, translation, and more. The Meta-Llama-3-8B version has 8 billion parameters, supports 8k context, and is suitable for Q&A and various natural language generation tasks. |
Mistral AI | Mixtral-8x7B | The first high-quality sparse Mixture of Experts (MOE) model released by Mistral AI, consisting of eight 7B-parameter expert models. It outperforms Llama-2-70B and GPT-3.5 on multiple benchmarks, handles 32K context, and performs especially well on code generation tasks. |
Baichuan | Baichuan-13B-Chat | Baichuan-13B is an open-source, commercially usable large language model developed by Baichuan Intelligence as the successor to Baichuan-7B. It achieves best-in-class results on Chinese and English benchmarks among models of its size. |
Qwen | Qwen2-72B-Instruct | Qwen2 is the latest series of large language models from Qwen, featuring both foundational and instruction-tuned models ranging from 0.5B to 72B parameters, including a Mixture of Experts (MOE) model. This entry is the instruction-tuned 72B Qwen2 model. |
Qwen | Qwen2-7B-Instruct | Qwen2 is the new version of the Qwen series of large language models, offering both foundational and instruction-tuned models ranging from 0.5B to 72B parameters, including an MOE model. This entry is the instruction-tuned 7B Qwen2 model. |
Qwen | Qwen1.5-14B-Chat | Qwen1.5 is the beta version of Qwen2, retaining the decoder-only transformer architecture with SwiGLU activation, RoPE, and multi-head attention. It is available in nine model sizes, improves multilingual and chat capabilities, and supports a 32,768-token context length. All models support system prompts for role-play and are usable with the native transformers implementation. This entry is the 14B chat model. |
Qwen | Qwen1.5-110B-Chat | Qwen1.5 is the beta version of Qwen2, retaining the decoder-only transformer architecture with SwiGLU activation, RoPE, and multi-head attention. It is available in nine model sizes, improves multilingual and chat capabilities, and supports a 32,768-token context length. All models support system prompts for role-play and are usable with the native transformers implementation. This entry is the 110B chat model. |
Qwen | gte-Qwen2-7B-Instruct | gte-Qwen2-7B-Instruct is the latest model in the gte embedding series, built on the Qwen2-7B model. It incorporates several key improvements through advanced embedding-training techniques. |
DeepSeek | DeepSeek-V2-Chat | DeepSeek-V2 is a powerful MOE (Mixture of Experts) language model known for its economical training and efficient inference. It contains 236 billion parameters in total, with 21 billion parameters activated per token. Compared to DeepSeek 67B, DeepSeek-V2 offers superior performance while reducing training costs by 42.5%, cutting KV cache usage by 93.3%, and increasing maximum generation throughput by 5.76x. |
Yi | Yi-1.5-34B-Chat | Yi-34B is a bilingual large language model developed and open-sourced by 01.AI. It is trained with a 4K sequence length and can extend to 32K during inference. Yi-34B achieves top performance across multiple benchmarks, setting new state-of-the-art (SOTA) records in international evaluations. |
Zhipu AI | ChatGLM3-6B-32K | ChatGLM3, co-developed by Zhipu AI and Tsinghua University's KEG lab, is the next-generation dialogue pre-trained model in the ChatGLM series. ChatGLM3-6B retains the conversational fluency and low deployment threshold of previous versions while adding a stronger base model, more complete functionality, and comprehensive open-source support; the -32K variant extends the context window to 32K tokens. |
Zhipu AI | GLM-4-9B-Chat | GLM-4-9B is the latest open-source version in the GLM-4 series by Zhipu AI, capable of multi-turn dialogue. GLM-4-9B-Chat also supports advanced features like web browsing, code execution, custom tool invocation (Function Call), and long-text inference with up to 128K context. |
BAAI | BGE-M3 | BGE M3-Embedding, developed by BAAI and the University of Science and Technology of China, excels in multi-linguality, multi-functionality, and multi-granularity. M3-Embedding supports more than 100 working languages and input text up to 8192 tokens, and offers dense, multi-vector, and sparse retrieval, achieving strong hybrid-retrieval performance. |
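
The table mixes text-generation models (for example Qwen2-72B-Instruct or GLM-4-9B-Chat) with embedding models (gte-Qwen2-7B-Instruct, BGE-M3). The snippet below is a minimal sketch of how such a service is commonly called, assuming demo.cdc.datenfab.com exposes an OpenAI-compatible HTTP API under `/v1`; the base URL, the `DEMO_CDC_API_KEY` environment variable, the endpoint paths, and the `chat`/`embed` helpers are illustrative assumptions, not documented behavior of the service.

```python
# Minimal sketch, assuming demo.cdc.datenfab.com exposes an OpenAI-compatible
# REST API under /v1 and that an access token is available. Endpoint paths,
# payload fields, and the environment variable name are assumptions.
import os

import requests

BASE_URL = "https://demo.cdc.datenfab.com/v1"   # assumed endpoint prefix
API_KEY = os.environ["DEMO_CDC_API_KEY"]        # hypothetical env variable
HEADERS = {"Authorization": f"Bearer {API_KEY}"}


def chat(model: str, prompt: str) -> str:
    """Single-turn request to one of the text-generation models in the table."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 256,
    }
    resp = requests.post(f"{BASE_URL}/chat/completions",
                         headers=HEADERS, json=payload, timeout=60)
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]


def embed(model: str, texts: list[str]) -> list[list[float]]:
    """Request vectors from one of the embedding models (gte-Qwen2-7B-Instruct, BGE-M3)."""
    payload = {"model": model, "input": texts}
    resp = requests.post(f"{BASE_URL}/embeddings",
                         headers=HEADERS, json=payload, timeout=60)
    resp.raise_for_status()
    return [item["embedding"] for item in resp.json()["data"]]


if __name__ == "__main__":
    print(chat("Qwen2-7B-Instruct", "Summarize the Transformer architecture in two sentences."))
    print(len(embed("BGE-M3", ["hello world"])[0]))
```

The value passed as `model` should match the Model Name column above; models that require an access request will presumably reject calls until access has been granted.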